[SPARK-56661] Implementing UDFDispatcherManager for new UDF worker sessions #55712
sven-weber-db wants to merge 1 commit into apache:master from
Conversation
// Must be called while holding `lock`.
private def handleSessionTermination(
    workerSpec: UDFWorkerSpecification
Maybe we should also pass the session object here?
Do you mean move the activeSessions.remove(session) call into this function like so?
private def handleSessionTermination(
    session: WorkerSession,
    workerSpec: UDFWorkerSpecification
): Unit = {
  activeSessions.remove(session)
  val entry = dispatchers.get(workerSpec)
  // Note: entry == null is unexpected here, so fail loudly.
  require(entry != null, s"No dispatcher entry found for $workerSpec")
  entry.activeSessionCount -= 1
  if (entry.activeSessionCount == 0) {
    logger.info("All sessions closed for dispatcher " +
      s"${entry.dispatcher.dispatcherId}, removing from cache")
    dispatchers.remove(workerSpec)
    onAllDispatcherSessionsClosed(entry.dispatcher)
  }
}
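For completeness, a minimal caller sketch showing how the "must be called while holding `lock`" contract would be honoured; the method name onSessionClosed and its wiring are assumptions for illustration, not code from this PR:

// Hypothetical caller (name is an assumption): handleSessionTermination
// must be called while holding `lock`, so the caller takes the lock first.
def onSessionClosed(
    session: WorkerSession,
    workerSpec: UDFWorkerSpecification
): Unit = {
  lock.synchronized {
    handleSessionTermination(session, workerSpec)
  }
}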
 * Not called during [[UDFDispatcherManager#stop]] -- the manager
 * cleans up dispatchers it holds directly in that case.
 */
def onAllDispatcherSessionsClosed(
it may be better to include this per dispatcher, rather than in this factory.
In my view, the point of this function is to decide how long a dispatcher is kept around. E.g. in a scenario where clients provide different worker specs, we cannot keep every dispatcher for the whole Spark runtime. There needs to be some sort of timeout, especially once we implement warm-pooling.
Should this dispatcher timeout be controlled by each dispatcher instead of a central place? The re-use logic may also depend on how many dispatchers currently exist. E.g. if we only need one dispatcher during the whole Spark lifetime, there is no need for eager cleanup. Therefore, I would argue it's more suitable to decide this in the factory, which has knowledge of the number and kind of dispatchers that currently exist.
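For illustration, a minimal sketch of such a factory-level policy with a hypothetical warm pool; everything except the onAllDispatcherSessionsClosed callback name is an assumption (including the stop() method on the dispatcher), so this is a sketch of the idea rather than the PR's actual API:

// Hypothetical factory-side lifecycle policy (illustrative names only).
trait WorkerDispatcher {
  def dispatcherId: String
  def stop(): Unit // assumed shutdown hook; not part of this PR
}

class PooledDispatcherFactory(maxIdleDispatchers: Int) {
  // Dispatchers whose sessions have all closed, kept warm for re-use.
  private val idlePool = new java.util.ArrayDeque[WorkerDispatcher]()

  def onAllDispatcherSessionsClosed(dispatcher: WorkerDispatcher): Unit = {
    // The factory sees the global picture: keep a small warm pool and
    // only tear down dispatchers once that limit is exceeded.
    if (idlePool.size() < maxIdleDispatchers) {
      idlePool.addLast(dispatcher)
    } else {
      dispatcher.stop()
    }
  }
}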
 * clean up any dispatchers/resources they hold beyond what the
 * [[UDFDispatcherManager]] manages.
 */
def onStop(): Unit
Same as above - the factory pattern means the factory only defines the object creation logic; onStop should be added to the dispatcher instance.
private val lock = new Object
private val dispatchers =
  new HashMap[UDFWorkerSpecification, DispatcherEntry]()
private val activeSessions = new ArrayList[WorkerSession]()
This logic looks more like dispatcher implementation details - e.g. a properly implemented dispatcher should manage its active sessions itself.
A dispatcher spawns workers and creates sessions, so it is more natural for the dispatcher to manage the sessions it starts.
For example, the dispatcher manager here creates dispatchers, so it takes care of disposing of dispatchers.
A dispatcher spawns workers and sessions, so the disposal of sessions and workers should live in the dispatcher.
Because creation and disposal logic can be related, it's better to colocate them.
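To make the suggested alternative concrete, here is a rough sketch of a dispatcher that owns both the creation and the disposal of its sessions; all names and the simplified WorkerSession interface are assumptions for illustration, not code from this PR:

// Hypothetical sketch: the dispatcher that creates sessions also tracks
// and disposes of them, so creation and disposal are colocated.
trait WorkerSession {
  def close(): Unit // simplified; the real interface differs
}

class SelfManagedDispatcher {
  private val lock = new Object
  private val activeSessions = new java.util.ArrayList[WorkerSession]()

  // Creation: the dispatcher spawns the worker/session and records it.
  def createSession(): WorkerSession = lock.synchronized {
    val session = spawnWorkerSession()
    activeSessions.add(session)
    session
  }

  // Disposal of a single session lives right next to its creation.
  def closeSession(session: WorkerSession): Unit = lock.synchronized {
    activeSessions.remove(session)
    session.close()
  }

  // Disposal of everything this dispatcher started, e.g. on shutdown.
  def stop(): Unit = lock.synchronized {
    activeSessions.forEach(s => s.close())
    activeSessions.clear()
  }

  // Placeholder for the actual worker/session startup logic.
  private def spawnWorkerSession(): WorkerSession = new WorkerSession {
    override def close(): Unit = ()
  }
}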
What changes were proposed in this pull request?
This PR implements a UDFDispatcherManager class in the new /udf package that was initiated by SPIP SPARK-55278. The purpose of the new manager class is to provide a single entry point for Spark with which a UDF session to an external UDF worker can be created, based on a WorkerSpecification instance. This manager and entry point will be used by follow-up PRs to implement new, language-agnostic Catalyst nodes.

Why are the changes needed?
The UDFWorkerManager serves two main purposes:
1. Creating new UDF worker sessions for clients.
2. Managing the WorkerDispatcher classes - depending on the UDFWorkerSpecification they are created for. This is required because the newly proposed UDF framework from SPIP SPARK-55278 enables clients to specify different UDF dispatchers for their UDFs. This implies (see the sketch after this list):
2.1. Multiple, different dispatchers can exist at the same time
-> The right one needs to be selected to create a UDF session
2.2. Dispatcher lifetime needs to be managed
-> Dispatchers and their resources need to be cleaned up if they are no longer needed by clients
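To illustrate points 2.1 and 2.2, a very rough sketch of the dispatcher-selection and cleanup idea follows; all class names, fields, and methods here are illustrative assumptions and do not reflect the actual API added by this PR:

import java.util.HashMap

// Hypothetical, simplified stand-ins for the real types.
final case class UDFWorkerSpecification(language: String)
trait WorkerDispatcher { def stop(): Unit }

class DispatcherCache(create: UDFWorkerSpecification => WorkerDispatcher) {
  private val lock = new Object
  private val dispatchers = new HashMap[UDFWorkerSpecification, WorkerDispatcher]()

  // 2.1: re-use the dispatcher matching the client's worker spec,
  // creating it only if no matching dispatcher exists yet.
  def getOrCreate(spec: UDFWorkerSpecification): WorkerDispatcher = lock.synchronized {
    var dispatcher = dispatchers.get(spec)
    if (dispatcher == null) {
      dispatcher = create(spec)
      dispatchers.put(spec, dispatcher)
    }
    dispatcher
  }

  // 2.2: dispatchers and their resources are cleaned up when no longer needed.
  def stopAll(): Unit = lock.synchronized {
    dispatchers.values().forEach(d => d.stop())
    dispatchers.clear()
  }
}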
Does this PR introduce any user-facing change?
No - all changes are marked as Experimental and are not yet consumed.

How was this patch tested?
New unit tests were added for the changes in the UDFDispatcherManager and WorkerSession.

Was this patch authored or co-authored using generative AI tooling?
Partially. However, the code was manually reviewed and adjusted.